Assessments matter in education. Testing is nothing new; tests have been around as long 
as school itself. However, over the last 15 years, state assessments have grown to be an 
increasingly central, and often controversial, part of schooling. As states raise their 
standards, it is more important than ever to ask: What is a high-quality assessment? 


This year, more than three million students will graduate 
from public high schools in the United States. 1 For the vast 
majority of these students, high school cannot be the end 
of their education. By 2020, the number of jobs requiring 
postsecondary education is predicted to reach 65 percent. 2 
According to one analysis, over a lifetime, workers with a 
college degree earn, on average, over one million dollars 
more than those with only a high school diploma. 3 
Employers increasingly require employees who have the 
ability to creatively solve non-routine problems, think critically, 
and communicate clearly. 


Current evidence indicates that too few students leave 
high school prepared for postsecondary education and 
the world of work. Only one in four students graduate 
from high school having met all four of the ACT College 
Readiness Benchmarks — a measure of preparedness for 
credit-bearing university and community college courses. 4 
Employers regularly struggle to find employees with the 
knowledge and skills necessary to be successful. More than 
one in five students fail to pass the assessment required 
for U.S. Army enlistment. 5 Troublingly, achievement gaps 
and inequity persist across lines of race, ethnicity, socio-economic
status, and disability. 6

In response to the need to improve outcomes for all students, states around the country have taken a first step: adopting 
higher subject matter or content standards aligned with the demands of college and career. Yet setting these new 
standards does not guarantee improved student outcomes and equity; a vital next step is adopting new assessments to 
measure the attainment of them. This brief will help state policymakers think about adopting and implementing high- 
quality, college and career ready assessments through the lens of a small set of policy considerations. 


Defining Assessment 

Assessments measure what a student knows and is able 
to do. Assessments range from informal (asking students 
questions in class) to formal (a college entrance exam). 

State Summative Assessments are typically given at 
the end of the year and are used for school and district 
accountability. In many cases, educator evaluations 
and student promotion or graduation decisions may be 
informed by these assessment results. 

District and School Assessments, often referred to as 
interim or benchmark assessments, are given periodically
throughout the year to determine progress against
academic goals. These assessments are commonly 
purchased or developed by districts or schools and not 
the state. 

Classroom Assessments are part of a teacher’s instruction
(like teacher-made tests and in-class questioning),
including formative assessment, which is the practice of
regularly assessing student learning during instruction to
inform and improve teaching and learning.


In 2013, 38 percent of 12th-graders nationally scored at a level on the National Assessment of
Educational Progress reading assessment that indicates they possess the knowledge, skills, and 
abilities in reading that would make them academically prepared for college. 7 


In 2013, 39 percent of 12th-graders nationally scored at a level on the National Assessment of
Educational Progress mathematics assessment that indicates they possess the knowledge, skills, and
abilities in mathematics that would make them academically prepared for college. 8

CONSIDERATIONS FOR HIGH-QUALITY ASSESSMENTS

Four considerations are outlined in this brief. While not exhaustive, all four considerations are critical to adopting improved 
college and career ready assessments. Additionally, assessments have important cost implications; this brief addresses 
cost and discusses the trade-offs between cost and quality in each section. 


ALIGNMENT 

What is taught must be what is tested... 
and vice versa. 


DELIVERY 

Soon, pencil and paper tests will largely 
be a thing of the past. 


INSTRUCTIONAL VALUE 

Attend to assessment in September 
...not just June. 


IMPACT 

Assessments matter — 
for accountability and for instruction. 



ALIGNMENT What is taught must be what is tested... and vice versa. 


Educational standards define the knowledge and skills 
students need to achieve. Assessments measure students’ 
grasp of the knowledge and mastery of the skills. 
Assessments must “align” to the content or subject matter 
standards. If the standard says a student in third grade
will be able to multiply and divide within 100, then the 
assessment for that grade should include questions that 
require students to do exactly that. 

A recent study of state summative assessments found 
that approximately half of the knowledge and skills in the 
content standards were not reflected on the assessments. 9 
Teachers also voice concerns about assessments. In a 
2012 Scholastic survey, only 26 percent of teachers said 
they believed the results of state standardized tests are an 
accurate reflection of student achievement, and only 28 
percent found their state assessments to be an important 
measure of student academic achievement. 10 


Why does alignment matter? First, fairness: Students 
need to be assessed on content they are taught, and 
teachers need to be able to see the content standards 
reflected in the assessments their students take. Second, 
misaligned assessments will produce unrepresentative and 
potentially misleading data — a problem for use in both 
instruction and accountability. When tests and standards 
are misaligned, it sends mixed signals to teachers. Because 
the state tests are tied to accountability, teachers often feel 
pressure to “teach to the test.” In many respects, the state 
summative assessment becomes the de facto content 
standard and drives the instructional agenda, so the 
quality of the state summative assessments will influence 
the substance of the instruction. All efforts to establish 
new college and career ready standards can either be 
propelled forward or significantly set back by the quality of 
the assessments.

ESTABLISHING NEW PROFICIENCY CUT SCORES

No Child Left Behind (NCLB, 2001) requires states to 
administer assessments and establish assessment 
cut scores that distinguish different levels of student 
performance: basic, proficient, and advanced. The
proficiency level is important because the provisions of 
NCLB hold states accountable for moving all their students 
to proficiency. Proficiency scores signal what is “good 
enough” at each grade level. But what is “good enough” 
has not been the same from state to state. For instance, 
it has been harder for students to score proficient in 
Massachusetts than in Georgia. 11 Importantly, a recent 
study of comparable state results found higher cut scores 
correlated with higher student achievement. 12

States that raise standards will want to establish 
proficiency cut scores on new state assessments that 
reflect high expectations. New assessments should report 
results that give students, educators and parents a clear 
sense of where a student is on the path to college and 
career readiness and establish meaningful proficiency cut
scores benchmarked against national and international 
expectations. While instituting new, more rigorous 
assessments has traditionally resulted in assessment 
scores dropping in the short term, this change should 
anchor the expectations in the rigorous, real-world goal of 
getting students prepared for life. 

ASSESS HIGHER-ORDER THINKING 

In Tough Choices or Tough Times, the New Commission
on the Skills of the American Workforce asserts, “If
someone can figure out the algorithm for a routine job,
chances are that it is economic to automate it.” 13 One
long-standing definition explains higher-order thinking as
“non-algorithmic...that is, the path of action is not fully
specified in advance.” 14 The ability to solve novel, non-routine
problems is of increasing value in the workplace,
and thinking critically, including the ability to evaluate,
synthesize, and create, is central to college and career
ready content standards. Yet higher-order thinking has
proven challenging to measure with current assessments.
A recent study of current state tests found that only two
percent of mathematics items and one out of five English
language arts items assessed higher levels of cognitive
demand. 15 The requirements of new content standards
and advances in assessment technology should motivate
test developers to create new assessments that measure
deeper learning. Assessment developers must be able to
produce convincing evidence that their assessments will
require students to demonstrate higher-order thinking.

The Case for Comparability

While current state tests provide
useful information, they do not provide
information on how student achievement
compares to other states. Because all
states are committed to college and career
readiness, similar standards, assessments
and cut scores might be used across state
lines. States could then compare results.
Comparable assessments tell a clearer
story about achievement, help further
educational research and reinforce the


NAEP Proficiency Scores versus State Proficiency Scores

[Chart comparing proficiency rates: NAEP versus State Average]


The National Assessment of Educational Progress (NAEP) — often 
referred to as the nation’s report card — is a nationally representative 
assessment given to measure the nation’s educational progress. When 
proficiency rates on state exams are compared with proficiency rates on 
the eighth-grade NAEP, a gap in expectations is readily apparent. The gap 
is 40 percentage points in English language arts and 32 percentage points in mathematics.

* Alliance for Excellent Education. (2013). High School State Cards: National. Washington, DC. Available at:
http://all4ed.org/wp-content/uploads/2013/09/UnitedStates_hs.pdf

ASSESS KNOWLEDGE AND SKILLS WITH WRITING AND PERFORMANCE TASKS

The majority of current state assessments use 
predominantly multiple-choice questions; they are cheap, 
can quickly assess a large amount of content, and are easy 
to score. Many states use only multiple-choice questions 
on some of their summative assessments. 16 The reliance
on multiple-choice questions is unpopular with many 
educators and seen as too often measuring lower-level, 
recall skills rather than higher-order thinking. For instance, 
testing how well a student can compose a well-reasoned 
argument in support of a claim will require a student to 
write. Additionally, written responses can reveal aspects 
of a student’s thinking and reasoning that multiple-choice 
questions may not. 

Content standards that call for research, communication, 
or complex problem-solving require performance tasks 
to be measured accurately. 17 Performance tasks present
students with real problems. For instance, students might 
be presented with background information on a problem,
such as health care costs, and have to analyze competing
points of view on the topic, devise a course of action, and 
develop a presentation to argue for that course of action. 

The jobs graduates will apply for will often depend on the 
clarity, accuracy, and persuasiveness of writing; universities 
and community colleges require strong writing skills for 
success. 18 The real world requires working on tasks that
defy easy answers. College and career ready assessment 
systems should incorporate a wide variety of assessment 
item types including items that require students to write 
and perform authentic, real-world tasks. 


ALIGNMENT Cost vs. Quality 

A recent report from the Government
Accountability Office (GAO) indicates that states
in the last 10 years have shifted toward multiple-choice
questions and away from open-ended,
constructed response questions largely as a cost
and time-saving measure.* Requiring students to
construct their responses (e.g., write or design)
on a summative assessment currently requires 
at least some human scoring. Scorers must be 
hired to evaluate student responses, and this 
typically costs more money than machine-scoring 
a multiple-choice bubble sheet. Additional costs 
might be anticipated as well for items that require 
development of computer simulations or extended 
performance tasks that require outside scorers. 

The cost of assessments varies widely across 
states ranging from $13 to $105 per student, 
with an average of $27 per student. Only a small 
portion of current K-12 education spending goes 
toward assessments — one-quarter of one percent. 
Analyses have shown the largest influence on cost 
per student is the total number of students in the 
state; the more students in a state, the lower the cost
per student. This is a result of fixed costs being 
distributed over a larger number of students and 
larger states having bigger contracts and more 
bargaining power.** 
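
The fixed-cost effect described above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming a simple model in which per-student cost equals a fixed development cost divided by enrollment plus a per-student variable cost. The function name and all dollar figures are hypothetical, chosen only for illustration, and are not taken from the Brookings analysis. The sketch does not model bargaining power.

```python
def cost_per_student(fixed_cost: float, variable_cost: float, students: int) -> float:
    """Per-student cost when a fixed cost is spread across all tested students.

    Assumed model (not from the source): fixed_cost / students + variable_cost.
    """
    if students <= 0:
        raise ValueError("students must be positive")
    return fixed_cost / students + variable_cost


# Hypothetical figures for illustration only.
FIXED_COST = 5_000_000.0   # test development, paid once regardless of enrollment
VARIABLE_COST = 15.0       # printing, delivery, and scoring for each student

small_state = cost_per_student(FIXED_COST, VARIABLE_COST, 100_000)
large_state = cost_per_student(FIXED_COST, VARIABLE_COST, 1_000_000)

print(f"Small state: ${small_state:.2f} per student")  # $65.00
print(f"Large state: ${large_state:.2f} per student")  # $20.00
```

Under these assumed numbers, a tenfold increase in enrollment cuts the per-student cost by more than two-thirds, because the fixed portion shrinks from $50 to $5 per student while the variable portion stays the same.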

While high-quality assessments will cost more, a 
recent cost analysis indicates that by teaming with 
other states and implementing additional cost- 
saving measures, states may be able to implement 
higher-quality assessments that measure more 
accurately the demands of college and career.*** 

* Government Accountability Office. (2009). Enhancements in the Department of 
Education's Review Process Could Improve State Academic Assessments. Available 
at: http://www.gao.gov/new.items/d09911.pdf 

** Chingos, Matthew M. (2012). Strength in Numbers: State Spending on
K-12 Assessment Systems. Washington, DC: Brown Center on Education
Policy at Brookings. Available at:
http://www.brookings.edu/~/media/research/files/reports/2012/11/29%20cost%20of%20assessment%20chingos/11_assessment_chingos_final.pdf


*** Topol, B., Olson, J., & Roeber, E. (2010). The Cost of New Higher Quality
Assessments: A Comprehensive Analysis of the Potential Costs for Future State
Assessments. Stanford, CA: Stanford University, Stanford Center for Opportunity
Policy in Education. Available at: https://edpolicy.stanford.edu/publications/pubs/120

DELIVERY Soon, pencil and paper tests will largely be a thing of the past.


Digital technologies hold great promise for 
helping to bring about many of the changes 
in assessment that the Commission believes 
are necessary. Technologies available 
today and innovations on the immediate 
horizon can be used to access information, 
create simulations and scenarios, allow 
students to engage in learning games and 
other activities, and enable collaboration 
among students. Such activities make it 
possible to observe, document and assess 
students’ work as they are engaged in 
natural activities — perhaps reducing the 
need to separate formal assessment for 
accountability from learning in the moment. 

THE GORDON COMMISSION ON THE FUTURE OF ASSESSMENT. (2013) 

A Public Policy Statement 


Technology is transforming life. Education is no 
exception. Instruction and assessment delivered via 
computers and other electronic devices offer new 
possibilities for personalizing and improving learning. 
In many ways, technology may blur the lines between 
assessments and instruction. 

Increasingly, the ability to analyze and think critically 
about media (typically with text, video, and graphics) 
is a vital skill for college and work. Video,
audio, voice capture, interactivity, and simulations are
all possible and may prove integral to developing more
aligned assessments. Online delivery of assessments
can assist educators in very practical ways and save 
time. For instance, scoring is made significantly easier 
because multiple-choice questions can be scored 
instantaneously without scanning bubble sheets, and 
writing samples can be distributed to scorers quickly 
and inexpensively. 

Finally, moving to online assessments positions 
schools for future innovation in both assessment 
and instruction. Imagine assessments that require a 
student to program code, voice an online presentation 
using digital media, or work in a simulated laboratory. 
Question types delivered via digital devices will likely 
evolve over time to yield clearer insight into student 
thinking. 19


States have been administering online 
assessments since as early as 2001 when 
Virginia began its online assessment 
program. Between 2001 and 2012,
thirty-two other states shifted partially or
wholly to online assessment.


State Educational Technology Directors Association website. (2014) 
Available at: http://www.setda.org/priorities/online-assessment/

DELIVERY Cost vs. Quality


Virginia’s Standards of Learning (SOL) Technology 
Initiative set the goal of establishing a computer-based 
instructional and testing system, and administering all 
state assessments online by fiscal year 2013. The initiative 
cited improved efficiency of data collection, increased 
accuracy of data, and increased test security among other 
reasons for setting these goals and allocating resources to 
the initiative. 

http://www.doe.virginia.gov/testing/ 


Idaho, with many small, rural districts, used the Idaho 
Education Network to bring high-speed broadband to every 
school, then focused efforts on professional development 
around technical skills at the local level to ensure schools 
were able to successfully deliver online assessments. 

http://assessmentstudies.setda.org/casestudies/idaho/#!/history-and-background


The costs of online delivery are challenging to 
estimate. For many states, moving to device- 
enabled assessments will require significant 
investments up front, including devices and 
infrastructure. However, online delivery can also 
save money by eliminating the need for printing and 
shipping or by automating more scoring. 

In weighing costs and benefits, policymakers will 
want to consider that many of the instructional 
advantages are even more compelling than the 
significant advantages of delivering assessments 
on devices. States (and districts within states) vary 
widely in their readiness for the delivery of online 
assessments and digital learning. An inventory 
of devices, connectivity, and readiness is a good 
starting place when considering the costs to move 
to online assessments. 

BUYER BEWARE: While the use of technology 
can improve assessments through features like 
computer adaptive technology or interactivity, 
there is very little inherent value in delivering an 
assessment on a computer or digital device. For 
instance, taking standard multiple-choice test items 
and transferring them onto a computer doesn’t 
make a better assessment. 


While technology implementation can require an outlay of up-front and ongoing 
costs, digital learning and technology can also provide more efficient use of human 
and fiscal resources, increase the productivity of teachers and administrators, and, 
most importantly, create conditions that raise student academic outcomes. 


Alliance for Excellent Education. (2012). The Digital Learning Imperative: How Technology and Teaching Meet Today's Education Challenges.

Available at: http://all4ed.org/wp-content/uploads/2012/01/DigitalLearningImperative.pdf

INSTRUCTIONAL VALUE Attend to assessment in September...not just June.


NCLB requires that students be tested annually using 
state summative assessments (see chart). 20 Schools 
are accountable for results on annual summative 
assessments, but make local instructional decisions. 
Schools and districts create or adopt their own curriculum 
and instructional materials. Because summative tests are 
primarily used for accountability and are taken at the end 
of the year, they do not provide ongoing, fine-grained data 
on student learning necessary for instructional purposes. 

In a balanced assessment system, teachers have the 
tools to check for understanding during the year in order 
to provide students with regular descriptive feedback 
and adjust their instruction to meet all students’ needs. 
Local control is honored, and schools and districts use 
formative and interim assessments when and where it 
is important for instruction. States have not typically 
funded assessments for learning (like flexible classroom 
assessment tools) to the same degree as summative 
assessments; yet, these assessment practices are the most 
frequently used and instructionally valuable. 21

When considering balanced assessment systems, 
states should pay the same attention to the quality of 
district and classroom assessment tools as they do to
summative assessments. Too often, low-quality summative
assessments have led to a cascade of low-quality
district interim assessments. In considering assessments,
policymakers should look for comprehensive, integrated 
and aligned systems that include flexible, high-quality tools 
for district, school and classroom purposes, as well as for 
state summative purposes. 


No Child Left Behind Testing Requirements

Grade     3    4    5    6    7    8    9    10-12
Math      X    X    X    X    X    X         Once in this span
English   X    X    X    X    X    X         Once in this span
Science   Once in this span (grades 3-5); once in this span (grades 6-9); once in this span (grades 10-12)

Each "x" represents one summative assessment, typically at the end of the 
year. All assessments have alternatives for students with the most severe 
disabilities. http://www2.ed.gov/nclb/accountability/ayp/testing-faq.html#4


INSTRUCTIONAL VALUE 
Cost vs. Quality 

Enabling the use of a balanced assessment system 
has a variety of cost implications. They might 
include: 

• Considering contracts with summative assessment
vendors that include high-quality tasks and tools that
are usable by schools and districts at their own
discretion.

• Supporting professional development in 
diagnosing student needs and using data to 
improve instruction. 

• Developing online spaces for collecting and 
sharing teacher best practices around assessment 
for learning.

IMPACT Assessments matter, for measurement and for instruction.


Assessments matter in education. They impact teachers, principals, parents, policymakers and, most importantly, 
students. The use of assessment in educational policy is rarely a home run; more often, pros and cons need to be 
considered. The table below discusses a few of the impacts to be considered as states move to college and career ready 
assessments. 


Impact 

Establishing new proficiency cut scores

Intended Impact: Ensure teachers and students work toward goals that prepare students 
for college and career, and hold high standards for all, regardless of race, class or ethnicity. 

Other Consequences: Instituting new, more rigorous assessments has traditionally 
resulted in assessment scores dropping in the short term. Because of this, policies 
like exit exams (for students), evaluations using achievement data (for teachers 
and principals), and accountability categories (for schools and districts) should be 
considered and, if necessary, adjusted in advance of the implementation of the new 
assessments. 

Improving assessment alignment

Intended Impact: Create assessments that accurately measure achievement of college 
and career ready standards; ensure that teachers and students believe the assessments 
are worth taking; incentivize instruction designed to teach the full range of important 
knowledge and skills and not a narrowed set of skills dictated by a narrow assessment. 


Other Consequences: The delivery of the assessments with writing and performance 
tasks may take more time than past assessments that relied almost wholly on multiple- 
choice questions. In some cases, new assessments may cost more to deliver than 
previous assessments; students and teachers may find these tests more challenging and 
will need support in implementing the new assessments. 

Delivering assessments on a device instead of pencil and paper

Intended Impact: Deliver assessments aligned to standards by taking advantage of richer 
item types and use of media; save time and/or money on scoring; set students up for 
technology-enabled learning and assessment in the future. 

Other Consequences: Moving to online delivery may be challenging for some rural or 
under-resourced schools and require system updates; ensuring enough devices are 
available may require reallocation of monies and/or securing of new monies to support 
the transition. 

Instituting an instructionally valuable assessment system

Intended Impact: Enable teachers to diagnose student need throughout the school year 
to guide instruction. 

Other Consequences: Ensure that educators have the skills to use a balanced system
effectively and efficiently, and in such a way that assessments are a seamless part of instruction
and not “another test.”

A Note on Time Spent Testing

Currently, many stakeholders are concerned about the amount of time spent testing in schools. Critics argue 
that curriculum is narrowed, districts give assessments that mimic narrow state summative tests, and this 
environment is at odds with the needs of students. Policymakers will do well to heed these concerns and 
ensure that every assessment has a clearly defined purpose, is not duplicative, and is likely to encourage great 
instruction. Because the amount of time spent testing is at least as much a local decision as a state-level 
one, professional development is an important component of ensuring high-quality assessments are used in 
purposeful, efficient ways that help students and create a positive learning environment. The Ohio Department 
of Education released a report in January of 2015 indicating that students spend approximately 19.8 hours 
testing per year, which equates to between one and three percent of the school year, depending on grade-level. 


CONCLUSION 

The four areas discussed in this brief — alignment, 
delivery, instructional value and impact — are not 
the only components of assessment quality. Among 
the considerations beyond the scope of this brief are 
technical quality, accessibility for all test-takers (including
English language learners and students with disabilities), 
stakeholder involvement in development, quality and 
clarity of reporting results and transparency of testing 
support materials. 

Given the scope and complexity of this work, calling on
the involvement and sign-off of impartial assessment
experts is an important part of adopting high-quality
assessments. Happily, a number of resources are, or will
soon be, available to assist policymakers in selection of
assessments and assessment systems. The Standards
for Educational and Psychological Testing (AERA, APA, &
NCME) are the gold standard for guidance on testing, and
the revised edition was released recently. The Council of
Chief State School Officers’ (CCSSO) Criteria For Procuring
And Evaluating High-Quality Assessments is an important
guide for the development of college and career ready
assessments. The Center for Assessment, a nonprofit
with expertise in large-scale educational assessment and
accountability, is developing and will release a framework
based on the CCSSO criteria for evaluating college and
career ready assessments to be used by states and others
to do objective, expert analysis of available college and
career ready assessments.